Onwards!
In this report, we focus on the initial set of 99,900 validators controlled by the Ethereum Foundation and the client teams. This report was compiled using data up to epoch 2240 (2020-11-28 10:56:00).
Clients are roughly equally distributed in the network at genesis. For each client, the EF operates around 20% of the associated validator set, while the remaining validators are maintained by the team behind the client itself.
We observe many more incorrect head attestations when the attestation is made for the starting slot of a new epoch. We define slot_index as the index of a slot within its epoch (from 0 to 31).
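With 32 slots per epoch, the slot index is simply the slot number modulo 32. A minimal helper (the function names are ours, not from the dataset):

```python
SLOTS_PER_EPOCH = 32

def slot_index(slot: int) -> int:
    """Index of a slot within its epoch, from 0 to 31."""
    return slot % SLOTS_PER_EPOCH

def epoch_of_slot(slot: int) -> int:
    """Epoch containing the given slot."""
    return slot // SLOTS_PER_EPOCH

# Slot 64 is the first slot of epoch 2, so its slot index is 0.
print(slot_index(64), epoch_of_slot(64))  # 0 2
```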
Attesters get the head wrong whenever the block they are supposed to attest to is late, arriving well after the attestation was published. We can check which clients are producing these late blocks.
Since these late blocks happen more often at the start of an epoch than at the end, epoch processing is the likely culprit: some clients appear to spend more time processing the epoch transition and fail to publish their block on time.
We can also check how the performance of validators attesting to blocks at slot index 0 evolves over time, again broken down by the client expected to produce the block at slot index 0.
Validators attesting on Teku-expected blocks at slot index 0 performed better during a period when the chain experienced difficulties and fewer blocks were produced, around epochs 200 to 300, which lines up with the suggested explanation of long epoch processing times.
In the plots below, the y-axis lists validators activated at genesis. A point is coloured green when the validator managed to get their attestation included for the epoch given on the x-axis; otherwise the point is coloured red. Note that we do not check the correctness of the attestation, merely its presence in some block of the beacon chain.
The plots allow us to check when a particular client is experiencing issues, at which point some share of validators of that client will be unable to publish their attestations.
A block can include at most 128 aggregate attestations. How many aggregate attestations did each client include on average?
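A sketch of the computation, assuming a per-block table with hypothetical column names (`proposer_client`, `n_aggregates`) standing in for the actual dataset:

```python
import pandas as pd

# Hypothetical toy data: one row per block, with the proposer's client
# and the number of aggregate attestations the block includes (max 128).
blocks = pd.DataFrame({
    "proposer_client": ["prysm", "lighthouse", "prysm", "teku", "nimbus"],
    "n_aggregates": [90, 110, 100, 128, 70],
})

# Average number of aggregates included per block, per client.
avg_aggregates = blocks.groupby("proposer_client")["n_aggregates"].mean()
print(avg_aggregates)
```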
Smaller blocks lead to a healthier network, as long as they do not leave attestations aside. We check how each client manages redundancy in the next sections.
Myopic redundant aggregates are aggregates that were already published, with the same attesting indices, in a previous block.
Subset aggregates are aggregates included in a block which are fully covered by another aggregate included in the same block. Namely, when aggregate 1 has attesting indices \(I\) and aggregate 2 has attesting indices \(J\), aggregate 1 is a subset aggregate when \(I \subset J\).
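Both notions can be checked directly with set operations on attesting indices. A minimal sketch under the definitions above (function names and data shapes are ours; real aggregates carry bitlists rather than Python sets):

```python
def subset_aggregates(block_aggs: list[set]) -> list[int]:
    """Indices of aggregates whose attesting indices are strictly
    contained in another aggregate included in the same block
    (I ⊂ J in the definition above)."""
    return [i for i, a in enumerate(block_aggs)
            if any(a < b for j, b in enumerate(block_aggs) if i != j)]

def myopic_redundant(block_aggs: list[set], previous_aggs: list[set]) -> list[int]:
    """Indices of aggregates whose exact attesting-index set already
    appeared in a previous block."""
    seen = {frozenset(a) for a in previous_aggs}
    return [i for i, a in enumerate(block_aggs) if frozenset(a) in seen]

# {1, 2} is fully covered by {1, 2, 3} within the same block.
print(subset_aggregates([{1, 2}, {1, 2, 3}, {4}]))  # [0]
# {4} was already published in a previous block.
print(myopic_redundant([{1, 2}, {4}], [{4}, {5, 6}]))  # [1]
```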
Lighthouse and Nimbus both score a perfect 0.
We first look at the reward rates per client since genesis.
Clients are hosted on AWS nodes scattered across four regions in roughly equal proportions. We look at the reward rates per region.
Performing an omnibus test to detect a significant difference between any of the four groups, we are unable to find such significance at epoch 800. Not long after, an experiment was performed, which we describe now.
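The specific omnibus test is not named here; as an illustration, a Kruskal-Wallis test over four synthetic groups of per-validator reward rates drawn from the same distribution (so no true difference exists between groups):

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(0)
# Synthetic per-validator reward rates for four regions; all four
# groups are drawn from the same distribution.
groups = [rng.normal(loc=0.01, scale=0.002, size=500) for _ in range(4)]

stat, p_value = kruskal(*groups)
print(f"H = {stat:.2f}, p = {p_value:.3f}")
# A large p-value means we fail to reject the null hypothesis of
# identical distributions, i.e. no significant difference between regions.
```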
Around epoch 1020, nodes from regions 1 and 2 were scaled down from t3.xlarge instances (4 vCPUs, 16 GB memory, with unlimited CPU burst) to m5.large instances (2 vCPUs, 8 GB memory, no burst). We observe a significant loss of performance despite continuous uptime.
Large decreases in all plots below for regions 1 and 2 indicate when nodes were stopped and restarted, circa epoch 1000 for region 1 and epoch 1025 for region 2. When we compare the performance of validators before and after the scaling down of regions 1 and 2, we use epoch 900 as control and epoch 1300 as treatment.
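The comparison method is not detailed here; one way to sketch it is a per-validator difference in reward rates between the control and treatment epochs, with a bootstrap confidence interval for the mean difference (synthetic data below, with an injected ~0.001 drop):

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic per-validator reward rates at the control epoch (900)
# and the treatment epoch (1300) for the same 1000 validators,
# with a simulated drop of about 0.001 after the scale-down.
control = rng.normal(0.010, 0.002, size=1000)
treatment = control - 0.001 + rng.normal(0, 0.0005, size=1000)

diff = treatment - control
# Bootstrap 95% confidence interval for the mean difference.
boots = [rng.choice(diff, size=diff.size, replace=True).mean()
         for _ in range(2000)]
lo, hi = np.percentile(boots, [2.5, 97.5])
print(f"mean diff = {diff.mean():.4f}, 95% CI = [{lo:.4f}, {hi:.4f}]")
# A CI entirely below zero supports a genuine drop in reward rates.
```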
Reward rates per client are affected in roughly equal proportions.
We explore further the difference between clients in regions 1 and 2 and in regions 3 and 4.
It seems that Teku is responsible for most of the reward decrease in regions 1 and 2. Prysm registers a significant, albeit small, decrease in reward rates between the two region groups too.
We look at four metrics across each region:
To obtain a time series, we divide the period between epoch 800 and epoch 1400 into chunks of 50 epochs. For each validator and each chunk, we record how many included attestations appear in the dataset (between 0 and 50), the number of correct targets and correct heads, and the average inclusion delay. We then average over all validators in the EF-controlled set, measuring metrics either per client or per region.
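The chunking and two-level aggregation can be sketched as follows, with hypothetical column names standing in for the actual attestation dataset:

```python
import pandas as pd

# Hypothetical toy data: one row per (validator, epoch) attestation duty.
df = pd.DataFrame({
    "validator":       [0,   0,   1,   1,   0,    1],
    "epoch":           [800, 849, 800, 860, 855, 1399],
    "included":        [1,   1,   0,   1,   1,    1],
    "correct_head":    [1,   0,   0,   1,   1,    1],
    "inclusion_delay": [1,   2,   None, 1,  1,    3],
})

# Assign each epoch to a 50-epoch chunk of the [800, 1400) period.
df["chunk"] = (df["epoch"] - 800) // 50

# Per-validator metrics within each chunk...
per_validator = df.groupby(["chunk", "validator"]).agg(
    included=("included", "sum"),
    correct_heads=("correct_head", "sum"),
    avg_delay=("inclusion_delay", "mean"),
)
# ...then averaged over validators within each chunk.
per_chunk = per_validator.groupby("chunk").mean()
print(per_chunk)
```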
We start by looking at the metrics per region.
Inclusion, target and head correctness all show insignificant differences between the two region groups (regions 1 and 2 versus regions 3 and 4). However, we observe an increase in the average inclusion delay, which likely explains the decreased reward rates for validators in regions 1 and 2.
Teku validators log a higher inclusion delay than others after the switch to smaller instances, as well as worse performance on other duties.